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A framework was developed for the optimization and validation of empir- 
ical dynamic models subject to an arbitrary set of validation criteria. The 
validation requirements imposed upon the model, which may involve several 
sets of input-output data and arbitrary specifications in time and frequency 
domains, are used to determine if model predictions are within admissible 
error limits. The parameters of the empirical model are estimated by find- 
ing the parameter realization for which the smallest of the margins of re- 
quirement compliance is as large as possible. The uncertainty in the value 
of this estimate is characterized by studying the set of model parameters 
yielding predictions that comply with all the requirements. Strategies are 
presented for bounding this set, studying its dependence on admissible pre- 
diction error set by the analyst, and evaluating the sensitivity of the model 
predictions to parameter variations. This information is instrumental in 
characterizing uncertainty models used for evaluating the dynamic model 
at operating conditions differing from those used for its identification and 
validation. A practical example based on the short period dynamics of the 
F-16 is used for illustration. 
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d Number of samples 

e Error 

e Admissible error 

g Requirement functions 

J Dimension reduction error 

J Cost function used for the maximum likelihood 
L Likelihood function 

m Aspect vector 

/i T . Mean dimension reduction error 

n Number of parameters in 6 

n g Number of requirement functions in g 

0 Parameters of the empirical model 

0 Maximal-margin parameter estimate 

0 C Geometric center of the bounding set 

6 Maximum-likelihood parameter estimate 

6 Critical parameter value (CPV) 

p Parametric safety margin (PSM) 

R Noise covariance matrix 

R Maximum likelihood noise covariance matrix 
t Time 

d Standard deviation of 9 

E Parameter covariance 

u(t) Input to the empirical model 

V Validation domain in d-space 

Vol Volume operator 

w Worst-case requirement function 

ay Maximal dimension reduction error 


I. Introduction 

Consider the linear time invariant (LTI) model described by 

x(t) = A(9)x(t) + B(9)u{t) (1) 

y (t) = C(6)x(t) + D(6)u(t ) (2) 

where 9 G is the vector of model parameters whose value is unknown, t G M is time, 
x(t) G M na: is the state, x(0) = xo is the initial condition, u(t) G is the input, y(t) G M n!/ 
is the output, A G M n * xn *, B G M n * xn “, C G M n !/ Xn3: and D G M n J/ xn ' i are the system 
matrices. The output of the physical system modeled by Eqs. (1-2) is measured at discrete 
time instants t t . These measurements are described by 


z(U) = y(U ) + v(U) (3) 

where z(ti ) G is the measured output for * = 1, . . . N, and v(ti) G M. ny is measurement 
noise. In aircraft dynamics, the parameters in 9 usually are stability and control derivatives. 
System identification entails using the u(U) and z(t t ) corresponding to a carefully chosen 
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maneuver to estimate the value of 9, and thereby identify the dynamic model. For most 
aircraft system identification, it is assumed that the input u is measured without error. 
Thus, we will denote by u(ti) the measurement of u(t) and by z( U) the noisy measurement 
of y(t). The input and output set and z{tj) corresponding to the system identihcation 
experiment will be referred to as the system identification data set. 

For traditional output-error parameter estimation, it is assumed that the measurement 
noise u is a white zero-mean Gaussian process, and no other source of uncertainty is present. 
The unknown parameter vector 9 is estimated by minimizing a weighted sum of squares of 
the difference between measured output z and model output y, in a maximum likelihood for- 
mulation. This formulation yields both the estimated parameter vector 9, and an ellipsoidal 
uncertainty set A e C M" centered at 9, which is expected to contain the true value of 9 with 
a specified level of confidence. This set is often approximated by A, a hyper-rectangle whose 
sides are the confidence intervals of the parameter estimates. Hence, the identified dynamic 
model consists of Eqs. (1-2), the estimated parameter 9, and the uncertainty set A. 

The identified model is often validated using an independent flight test maneuver. The 
corresponding data set, to be called a validation data set, consists of the input u v (ti) G 
and the output z v (t t ) G K riy , for i = 1 .... Ay . The validation of the identified model 
depends on its ability to predict z v given u„ . In conventional aircraft system identification, 
model validation is done heuristically without mathematical formalism. The identified and 
validated empirical model, along with probabilistic descriptions of uncertainty based on A, 
are used in Monte Carlo analyses to study the probable range of performance for the physical 
system under different sets of operating conditions and/or different inputs. 

This paper proposes a new approach to parameter estimation and dynamic model val- 
idation based on post-flight analysis. The framework proposed, which integrates multiple 
requirements corresponding to arbitrary performance metrics and/or several input-output 
data sets, is applicable to dynamic models with arbitrary functional forms, e.g., a nonlin- 
ear dynamic model with nonlinear parameterization. For example, one requirement may be 
based on the difference between the measured output used for system identification, z, and 
the predicted output y(6, t, u ) using the / 2 - n orm of the difference as a performance metric, 
whereas another may be based on the maximum offset between the validation output z v 
and the predicted output y(9,t,u v ) using the ^-norm of the difference as a performance 
metric. Any value of 9 leading to sufficiently small differences is an admissible realization 
of the model parameters. The collection of all of these realizations constitute the valida- 
tion domain. Hence, the system in Eqs. (1-2) evaluated at any member of the validation 
domain yields a prediction that meets both validation requirements. The size and shape 
of the validation domain characterizes the uncertainty in the parameters of the empirical 
model. In this setting, parameter estimation is performed by identifying the realization of 
the validation domain for which the smallest margin of requirement compliance is as large 
as possible. This represents an optimized dynamic model relative to the selected validation 
criteria. The validation domain plays a key role in formulating uncertainty models used by 
Monte Carlo analyses. Other uses include comparing flight results with ground-based pre- 
dictions, calibrating computational fluid dynamics calculations, stability and control flight 
testing and flight envelope expansion, accident investigation, adaptive and learning control, 
and fault detection. 

This paper presents techniques for calculating parameter estimates given several valida- 
tion requirements, bounding the validation domain, studying its dependence on admissible 
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prediction error set by the analyst, and evaluating the sensitivity of the model predictions 
to parameter variations. This paper is organized as follows. Section II presents a summary 
of current practice in aircraft system identification and uncertainty analysis using the time- 
domain output-error method. Section III presents the proposed model validation technique. 
The global sensitivity analysis of the model predictions is covered in Section IV. Section 
V discusses practical implementation issues. Section VI illustrates in detail the scope of 
the methodology using a simple example. This is followed by Section VII, where a brief 
comparison between the legacy and proposed strategies is made. Finally, Section VIII closes 
the paper with a few concluding remarks. 

II. Current Practice 

The elements of the unknown parameter vector 9 in Eqs. (1-2) can be estimated from 
measured data using an output-error formulation in the time domain. In this approach, 9 is 
chosen to minimize the cost function 

1 N N 

J ( 6 , u,z,R) = - ^2 [. z(U ) - y(6, U, u)] T iT 1 [z(U) - y(9, U, u)] + — ln(|.R|) (4) 

2=1 

where R G R" y xn,J is a weighting matrix. If R is known, then only the term involving the 
summation in Eq. (4) is relevant for the optimization, with the elements of the 9 vector 
being the only unknowns. The cost function is then the weighted sum of squared differences 
between the measured output z and the model output y for the given input u, 

1 N 

e 2 (9, u,z,R) = - [z(U) - y(6 , U, u)] T i? _1 [z(U) - y(6 , U, u)] (5) 

2=1 

This formulation is called output error because the unknown parameters are chosen to min- 
imize the weighted squared errors between the measured output and the model output. 

The weighting matrix R in Eq. (4) is in general arbitrary, but if the noise u is assumed 
to be a sample from a stationary, zero-mean, white Gaussian process, and R is chosen as 
the corresponding discrete noise covariance, E[z/(tj)^(tj) T ] = RS,,j , then the minimization 
of the cost in Eq. (4) is equivalent to the maximization of a Gaussian likelihood function, 
L(9,u,z,R), defined as[l] 

L(6,u,z,R) = jz exp {-( '>'[.:(/,) - R~' t.. u) 1 (6) 

[(2ir)"»|B|] T { 2 tt J 

The cost function in Eq. (4) is the negative logarithm of the likelihood function given in 
Eq. (6), except for a constant term which has no bearing on the optimization. Maximizing 
the likelihood function is equivalent to minimizing its negative logarithm. Therefore, the 
maximum likelihood estimate 6 is found as 

9 = argmax L(9, u, z, R) = argmin J(9, u, z, R) (7) 

e 0 

Maximum likelihood estimates have several desirable properties as the number of data points 
N gets large, such being asymptotically unbiased (estimate approaches the “true” value), and 
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having repeated estimates approach a Gaussian distribution with variances that approach 
the theoretical minimum. 

Because of the nonlinearity of the cost function in Eq. (4), along with the generally 
coupled dynamics in Eqs. (1-2), the output-error cost function depends nonlinearly on the 
unknown parameter vector 9 and also on the noise covariance matrix R. Consequently, a 
nonlinear optimization algorithm must be used to solve the problem. Typically, a relaxation 
method, 1 that combines the modified Newton- Raphson method (also called Gauss-Newton 
method) and the simplex method, can be used to minimize e 2 in Eq. (5), initially with R 
assumed equal to the identify matrix. This yields the estimate 9. The maximum likelihood 
noise covariance, R, which minimizes the cost in Eq. (4) with respect to R for fixed 6, is 
given by the closed-form expression[l] 

1 n 

R=-fi'52[z(ti)-y(0,ti,u)] [z(U) - y (9,ti,u)] T (8) 


The relaxation method alternates between the nonlinear optimization used to find 9 using 
Eq. (5) and the evaluation of R using Eq. (8), until both estimates converge using selected 
convergence criteria. 1 Although there is no formal proof that the relaxation method will 
converge, extensive practical experience has shown that it does converge. This procedure has 
been mechanized in software and used successfully for a wide variety of aerospace dynamic 
modeling problems. Reference [1] provides further theoretical and practical details on the 
output-error parameter estimation method and the solution procedure outlined here. The 
output-error solution yields maximum likelihood estimates for both the unknown parameter 
vector 9 and the noise covariance matrix R. 

The procedures for calculating the parameter covariance matrix E, and an uncertainty 
for 9 are presented next. The diagonal elements of E are the variances of the parameter 
estimates whereas the off-diagonal elements are the covariances. The Cramer-Rao lower 
bound for the covariance matrix, also called the dispersion matrix D e M nxn , is defined as 

(9) 


D = 


N 


Y j S t (U)R~ 1 S(U) 


1=1 


where S G M nyXTi , called the output sensitivity matrix, quantifies the sensitivity of the model 
outputs to the model parameters at the maximum likelihood estimate: 


S(U ) 


dy(U) 


d9 


0=0 


( 10 ) 


Output sensitivities can be calculated analytically by solving a set of sensitivity equations 
derived by differentiating Eqs. (1-2) with respect to 9, or by applying a finite difference 
technique to Eqs. (1-2), see Ref. [1] for details. The first-order approximation of the 
sensitivity of the model output to uncertainty is given by the product of S and the range 
of uncertainty. If the columns of the output sensitivity matrix S are non-zero and have 
low pairwise correlations, then each model parameter has a distinguishable effect on the 
output y(t), and Eq. (7) is a well-conditioned optimization problem leading to an accurate 
parameter estimate with small uncertainties. 
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The connection between the Cramer-Rao bound D and the input u is complicated and 
nonlinear, even when the dynamic model is linear, as in Eqs (1, 2). Measured output noise 
levels (quantified by R) also affect D. Roughly speaking, D is determined by squared noise 
to signal ratio, where R represents the noise variance (square of the noise level), and the 
output sensitivities represent the signal strength for the parameter estimation problem. Be- 
cause the covariance matrix E for maximum likelihood estimates asymptotically approaches 
the Cramer-Rao lower bound D as N increases, it is common practice to assume that the 
parameter covariance matrix E is approximately equal to D evaluated at the maximum like- 
lihood estimate, i.e., EwD. Modern flight instrumentation systems collect flight data at 
50 Hz or more, so that even relatively short maneuvers have a large number of data points 
N. This fact is used to justify the approximation E ~ D. 

As the number of data points increases, maximum likelihood parameter estimates are 
asymptotically Gaussian with mean value equal to 9 and covariance equal to E. The corre- 
sponding probability density function is 

The hyper-ellipsoidal uncertainty set corresponding to the (1 — a)% confidence level is then 

A e (M,E) = { 9 : (9-0) T E-^e-e) <r,\a )} (12) 

where 9 is the center of the ellipsoid, and the value of //(a) can be determined from 1 — a = 
erf(?7/\/2). The values of 9 for which the equality in Eq. (12) holds are level curves of the 
density function in Eq. (11). The normalized eigenvectors of E are the principal axes of 
the ellipsoid in Eq. (12), and the corresponding eigenvalues are the squared inverse of the 
corresponding semi-axis lengths. A common choice is to use the 95% confidence intervals, for 
which 77(0.05) ~ 2. A confidence interval is an interval that includes the parameter of interest 
with a given level of confidence for repeated experiments. More specifically, if confidence 
intervals are constructed across many separate data analyses of different experiments, the 
proportion of such intervals that contain the true value of the parameter will match the 
confidence level. This is guaranteed asymptotically as more data is available. 

The hyper-ellipsoidal uncertainty set in Eq. (12) can be roughly approximated by the 
hyper-rectangular uncertainty set 

A ( 9 , a, E) = [(9i - r?(a;)d-i, #1 + ??(«)di] x . . . x [9 n - r/(a)a n , 9 n + r/(a)a n ] (13) 

where 07, j — 1, . . . n, are the standard deviations of the estimated parameters. Each term 
on the right hand side of Eq. (13) corresponds to the (1 — a)% confidence interval of a 
component of 9 for the case in which the parameters are uncorrelated. Hyper-rectangles 
have the advantage that the range of an individual parameter is not affected by the ranges 
of the other parameters. This is in contrast to bounding sets with other geometries, such as 
ellipsoids. In general, none of the sets in Eqs. (12) and (13) contain each other. 

The standard procedure in aircraft system identification for many years was to quote the 
maximum likelihood estimates of the parameters 9, with 95% confidence intervals computed 
based on D, and the Gaussian noise assumption. However, in practice, it was found that 
model residuals (difference between measured output and model output) are typically not 
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white Gaussian with constant variance, as assumed in the theory, but in fact are colored (time 
correlated) with nonuniform variance. This occurs because in practical problems, the model 
residuals include not just measurement noise, as assumed in the theory, but also model 
structure errors from effects not included in the model, such as nonlinearities, structural 
responses, and unsteady aerodynamic effects. The mismatch between theory and practice 
results in estimated parameter uncertainty bounds that are overly optimistic (too small) 
when using the standard procedure described above. Reference [2] provides some heuristic 
techniques to address this problem. References [1,3,4] provide a rigorous solution, validated 
using both simulation data and flight test data, which involves improving the R estimate, 
and using it to compute a corrected D matrix. This solution has become the dominant 
approach in recent years, because standard errors computed in this way match the scatter in 
parameter estimates from repeated experiments at the same flight conditions. The necessary 
correction can be done as a post-processing of the residual sequences from an output-error 
maximum likelihood solution, because only the D matrix elements (and consequently the 
estimated parameter uncertainties) are affected, but not the parameter estimate 9. 

Further details on the material in this section can be found in Ref. [1], which also includes 
MATLAB® software that implements the procedures described here, along with many other 
procedures and techniques related to aircraft system identification. 

III. Parameter Estimation and Validation Domain Bounding 

This section presents techniques for calculating a parameter estimate and bounding the 
validation domain. Recall that the dynamic model evaluated at any parameter realization 
of this domain yields predictions that satisfy the validation requirements. The validation 
domain is, in general, geometrically complex and multi-dimensional. These features often 
preclude its exact determination and analysis. Techniques for calculating inner bounding sets 
(subsets) and outer bounding sets (supersets) of this domain are presented below. While the 
inner bounding sets are guaranteed to consist only of validation points, the outer bounding 
sets are guaranteed to contain all the members of the validation domain. Some of the uses 
of these bounding sets are listed in Section VII. For purposes of this paper, these bounding 
sets are hyper-rectangles having edges that are parallel to the coordinate axes of d-space. 

The dependency of the dynamic model in Eqs. (1-2) on 6 makes the stability and per- 
formance metrics derived from this model dependent on 9. The error measure in Eq. (5) 
and 

e«> (9,u,z,R) = max | \z(ti) -y(ti,0,u)] T R~ l [z{U) - y(t u 9, u)]J (14) 

which is an W-norm, are examples of these metrics. Metrics like these will be used to 
prescribe a set of validation requirements. A dynamic model will be regarded as adequate 
when the corresponding model predictions comply with all the requirements. A requirement 
will be prescribed as an inequality constraint on the value of the error between the target 
behavior and the model prediction. If e{9) denotes such an error, and e is the largest 
admissible value of e(9), this inequality can be described as 

g(9,e) = e(9) -e< 0 (15) 

The dynamic model is compliant with the requirement if and only if the inequality is satisfied. 
An extension to the multiple requirement case can be easily made. The multiplicity of 
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requirements may stem from having multiple error forms or from having multiple data sets. 
For instance, the satisfaction of the vector inequality 

g(0,e) = [e 2 (0, u, z, R) - e 1 ,e oo (0,u v ,z v ,R) - e 2 ] < 0 (16) 

means that the square of the weighted I 2 difference between the predicted data and the 
measured data does not exceed ei, and that the weighted /oo-norm of the difference between 
the predicted data and the validation data does not exceed e 2 . 

The vector inequality g(9, e) < 0, where g G R r ' 9 , partitions the space of model parameters 
0 into two regions - a region where all requirements are met, and a region where at least one 
of them is violated. The validation domain V, or requirement-compliant domain, defined as 

n g 

vw = H {»:»(». €j)<0} (17) 

3 = 1 

is comprised of all parameter realizations leading to predictions meeting all of the validation 
criteria. Each term on the right hand side of this equation defines the validation domain 
corresponding to an individual requirement. An alternative expression for the validation 
domain is V(e) = {6 : w(9, e) < 0}, where 

w(6,e) = max gj{0,6j) (18) 

!<?<«■<) 

is the worst-case requirement function. This formulation allows describing V through a 
single inequality constraint. The complement of the validation domain, C[V], where C[-\ 
denotes the set complement operator, will be called the requirement-noncompliant domain. 
Parameter realizations within this domain will violate at least one of the requirements. 

The underlying logic supporting the calculation of the uncertainty set in Section II is 
to bound the “true” value of 6 via confidence intervals, whereas the rationale used here is 
identifying the infinitely many realizations of 6 for which the model predictions are adequate; 
e.g., the realizations of 6 for which the offset between the measurements and the models’ 
prediction is within admissible error bounds. 

This formulation can also be used to impose mathematical attributes, not necessarily 
described by the system identification or validation experiments, upon the dynamic model. 
For instance, a local stability requirement for the open-loop system can be imposed by 
including in g the requirement function max{9ft[Ai(0)], . . . , 9ft[A na .(0)]} — e, where A k,k = 
1, . . . n x are the eigenvalues of A(6), $R[-] is the real part operator, and e = 0. In the setting 
of Eq. (16), the parameter realizations within the validation domain yield locally stable 
dynamic models whose predictions adequately describe the measured outputs. Another 
requirement that often comes up in practice is that a particular model parameter should 
have a specific sign. If 9 a is to be negative, include g : j(0. ef) = 9 a in the g vector. For a 
positive 9 a , use gj(9,ej ) = — 9 a . 

The value of —w(0, e) can be interpreted as a margin of requirement compliance associated 
with 9. In this setting, the parameter estimate 9, to be called the maximal-margin estimate , 
is defined as 

9 = argmin {w(9, e)} (19) 

e 

Therefore, the empirical model evaluated at 9 yields predictions that maximize the smallest 
margin of requirement compliance, i.e. , 9 makes the value of the largest component of g is as 
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small as possible. If w(6, e) < 0, then 9 is an element of the validation domain. Otherwise, 
V(e) is empty. 

Techniques for calculating inner bounds require a non-empty requirement-noncompliant 
domain, whereas those for generating outer bounds require that V be bounded. Figure 1 
shows a sketch for n = 2, where the boundary of V is shown as a solid black line. Two 
rectangular inner bounding sets, denoted as Bi nner and £>i nner , and two outer bounding sets, 
denoted as B ou ter and £> oute r , all centered at 6 C , are also shown. Note that £>i nner and £> oute r 
are, respectively, the minimum and maximum area rectangles bounding V having 9 C as their 
center. £>i, mcr and £> outcr are the optimal squares. 



Figure 1. Bounding sets of the validation domain 

The m-scaled infinity norm is instrumental for calculating hyper-rectangular bounding 
sets. For a vector m € M n with positive components, the m-scaled infinity norm of a € R” is 
defined as ||a||“ = max{|afc|/m.fc}. The vector m, called the aspect vector , sets the orientation 
of the positive semi-diagonal of the hyper-rectangle. For example, the aspect vector of 
Binner and £> ou ter in Figure 1 is m = [1,1], while B j nn eT and B llUter use an aspect vector 
that maximizes and minimizes the bounding set area, respectively. Reference [5] provides 
background information for the developments that follow. 

A. Subsets of the Validation Domain 

Formulations for calculating hyper-rectangular subsets of V are presented below. While 
empirical models evaluated at any point from within such a subset will comply with the 
validation criteria, models evaluated outside this set may or may not satisfy these criteria. 
We present two formulations, one for a fixed aspect vector m and another which calculates 
an optimal aspect vector m. The optimal bounding set resulting from the latter formulation 
minimizes the volume of the uncovered portion of the validation domain. 
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1. Fixed Aspect Vector 


Techniques for calculating the largest hyper-rectangle centered at 9 C with aspect vector m 
that is fully contained in V are presented next. If g(0, e) < 0 is the set of requirements 
imposed upon the dynamic model, the hyper-rectangle 

dinner {9ci tU • Pinner) {9 . 9 C Pinner^ ^ 9 9 C T Pinner^} (20) 

where 9 C G V, with Pi nne r specified as 


pinner(e) = min (HP - 9 C \\™ : w(0,e) > 0} (21) 

0 

is a subset of V. The calculation of a p, called the parametric safety margin (PSM), as in Eq. 
(21), entails solving a nonlinear optimization problem. A value of 9 at which the optimum 
of Eq. (21) occurs, 9, is called a critical parameter value (CPV). A CPV is a parameter 
realization where the bounding set and the requirement-noncompliant domain touch. The 
small yellow circles shown in Figure 1 show where the CPVs are. The components of g 
taking the value of zero at a CPV are the critical requirements. The PSM, each CPV, and 
the critical requirements are all functions of the geometry of V, which in turn depends on 
the admissible error e. 

The values of the constants e, 9 C , and rn must be set before solving Eq. (21). The 
value of the components of admissible error vector e should be set according to the nature 
and importance of each requirement. Some requirements allow for a somewhat subjective 
selection of e (e.g., allowing for a 10% larger admissible error may still yield a sufficiently 
accurate prediction), whereas others may not (e.g., requiring the dynamic model to be locally 
stable demands an e = 0). For the subjective cases, the sensitivity of V to e can be explored 
by calculating the monotonically increasing and possibly discontinuous function Pi nner (e). 
Discontinuities in Pi„„ cr (e) may arise when the iv(9. e) for a fixed e has several local maxima. 
Additional information on pj imcr (e) is provided in Section III-C. The values of 9 C and m, which 
prescribe the geometry of the bounding sets, can be chosen in multiple ways. The value of 
9 C should yield an accurate prediction, and the value of /?q. should be made proportional to 
the level of uncertainty in the value of 9^. A natural choice for the center of the bounding 
sets is 9 C = 9. The conventional approach of Section II can be used to set m = a. A poor 
selection of 9 C and m may yield no solution for an inner bounding set (e.g., a case where 
9 C = 9 is outside the validation domain), or may yield unnecessarily loose bounding sets. 
The subjectivity in the selection of 9 C and m can be eliminated using the strategies proposed 
in the next section. 

Imposing additional requirements or lowering the admissible error e (or both) cannot in- 
crease Pinner- Rather, these actions will shrink the validation domain, and thus, the associated 
bounding sets. Note that the combination of several requirements, not necessarily stringent 
by themselves, may lead to an empty V. So, it is possible to impose two requirements each 
leading to a satisfactorily large validation domain by itself, but whose intersection is com- 
paratively small. This would imply that the sets of parameter values in which the model 
satisfies either of the two requirements is large, but the set in which both requirements are 
satisfied is small. 
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2. Optimal Aspect Vector 


Techniques for calculating the hyper-rectangle centered at 9 C of largest volume within V 
are presented next. In contrast to the fixed-m formulation, now the optimal value of m is 
determined in an outer optimization loop 


max { Vol(£?i nner ) } 

m 

The resulting bounding set is given by Eqs. (20) and (21), where 


( 22 ) 


m h 


,(e) = argmax i finerM “t. IMI = 1. > 0 


(23) 


k = 1 


is the optimal aspect vector. The PSM and bounding box corresponding to mi nner will be 
denoted as Pmner (e) and £>i nncr , respectively. Note that finding mi nner requires solving a nested 
optimization problem where a search for the PSM for a fixed m occurs in the inner loop, 
while a search for the optimal aspect vector m occurs in the outer loop. As before, the 
resulting hyper-rectangle is an inner bounding set of V. For the same 9 C , the bounding set 
resulting from this formulation is better than the fixed-m formulation in the sense that the 
volume of the validation domain not covered by the bounding set is the smallest. Figure 1 
shows the optimal bounding set £> inner as a red rectangle, and the corresponding CPVs as 
yellow circles. 


B. Supersets of the Validation Domain 

In this section, techniques for finding supersets of V are developed. The empirical model 
evaluated at any point outside this superset is requirement-noncompliant, whereas the model 
evaluated at points inside this superset may either be requirement-compliant or requirement- 
noncompliant. As before, two formulations are presented - one for a fixed m, and another 
for an optimal m. The optimal bounding set resulting from the latter formulation contains 
all the parameter realizations that comply with the requirements, with minimal excess, i.e., 
without extending unnecessarily over the requirement-noncompliant domain. 

1. Fixed Aspect Vector 

A formulation for calculating the smallest hyper-rectangle centered at 9 C with fixed aspect 
vector m that contains V is presented next. If g(9, e) < 0 is the set of requirements imposed 
upon the dynamic model, the hyper-rectangle 

Pouter (9 C , TIL p) { 9 . 9 C Pouterttl Vi 9 V 9 c T Pouter^} (24) 

with Pouter is specified by 

Pouter (c) = max{||0 - 9 C ||“ : g(9,e) < 0} (25) 

0 

is a superset of V. As before, the calculation of the PSM requires solving a nonlinear 
optimization problem. By construction, the family of empirical models resulting from varying 
9 over C [B ou ter] will violate at least one of the requirements. The PSM, the CPV, and 
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critical requirements introduced in Section III-A can be naturally extended to the supersets 
considered here. Figure 1 shows the outer bounding set for m = [1, 1], labeled B on te r, as a 
blue rectangle, and the corresponding CPV as a yellow circle. 

As before, the values of 9 C , m , and e should be prescribed according to engineering 
judgement. Although the center of the inner bounding set 0 C must be a point within the 
validation domain (otherwise the bounding set would not be contained by V), the center of 
the outer bounding set can be anywhere. Actually, if the validation domain is a disconnected 
set, the selection of a 9 C outside the validation domain can yield smaller bounding sets, e.g., 
the smallest rectangle that contains two non-intersecting circles of equal diameter has a 
center that is outside both circles. 

As with the inner bounding sets, the sensitivity of V to e can be studied by calculating the 
monotonically increasing and possibly discontinuous function Pouter (e)- These discontinuities 
are the result of w(9,e) having multiple local minima. Additional information on p outcr (c) is 
provided in Section III-C. 


2. Optimal Aspect Vector 

Techniques for calculating the hyper-rectangle centered at 9 C of smallest volume that fully 
contains V are presented next. In contrast to the fixcd-m formulation, the optimal value of 
m is determined in an outer optimization loop 


min { Vol ( i? 0 uter ) ) 


The resulting bounding set is given by Eqs. (24) and (25), where 

Pouter (e) = argmin < Po Uter (e) JJm fc ,||m|| = l,m>0 


k = 1 


(26) 


(27) 


is the optimal aspect vector. The PSM and bounding set corresponding to m outcr will be 
denoted as Pouter (e) and <8 ou ter, respectively. As before, finding m oute r requires solving a 
nested optimization problem where a search for the PSM for a fixed m occurs in the inner 
loop, and a search for the optimal aspect vector m occurs in the outer loop. For the same 
9 C , the bounding set resulting from the free-m formulation is better than that of the fixed-m 
formulation in the sense that the volume of the requirement-noncompliant domain covered by 
the outer bounding set is the smallest. Figure 1 shows the optimal superset (green rectangle) 
and the corresponding CP Vs (yellow circles). Note that multiple CPVs lay on the boundary 
of both inner and outer optimal bounding sets. 

The free-m formulations for calculating bounding sets of V can be trivially extended to 
the case where the center of the bounding set 9 C , along with m, are determined in the outer 
optimization loop. This practice leads to hyper-rectangular bounding sets where the offset 
between them and V is minimal. This class of bounding sets will not be pursued here. 


C. Dependency of the Bounding Sets on the Admissible Error 

This section considers possible complexities of V, the bounding sets, and the PSMs, as 
the admissible error e prescribing the requirements changes. A simple example is used to 
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illustrate the case where the monotonically increasing function p(e) corresponding to both 
inner and outer bounding sets of V is discontinuous. 

Figure 2 shows an error function e(9), defined by the gray line, for a single unknown 
parameter 9. A piece-wise linear e(9) has been chosen to facilitate the exposition. In this 
case the bounding sets are intervals centered at 6 C , which is the global minimum of the error 
function. Note that the inequality g = w = e(9) — e < 0, which solely defines the validation 
domain, is the singleton 6 C when e = 0. Larger values of e yield validation domains with 
infinitely many points. The validation domains corresponding to a continuum of e values is 
colored in green. The validation domain for ei comprises two intervals, whereas the validation 
domain for e 3 is a single interval. The colored lines in the left figure represent the endpoints 
of the inner bounding interval of V (dashed red line) and the outer bounding interval of V 
(dashed blue line). The half length of the intervals is the PSM. 

The right figure shows the PSMs pi nner and p ou ter as a function of e. The local maxima 
and minima of e(8) yield discontinuities in Pmner(e) and p ou ter(e), respectively. As expected, 
Pouter > Pinner for all c. The functions Pinner (c) and p ou ter (e) relate the maximum value of the 
worst-case error (abscissa) that occurs within a box centered at 8 with aspect vector m of a 
given size (ordinate). In this particular example, these functions relate the maximum value 
of e(9) that occurs within an interval centered at 9 having a half length equal to the PSM. 
The discontinuities in the PSM are cases where a differential change in the admissible error 
e yields a significant change in the validation domain, and thus, in the bounding sets. In 
the case shown, each discontinuity is associated with V(e) changing between being simply- 
connected and being not simply-connected. 



Po liter ** 


Pinner 



fo e 3 


e 


Figure 2. PSM of inner and outer bounding sets of V as a function of e 

The following section examines the global sensitivity of all the components of g, which 
may well include metrics based on the output y(t), to the model parameters as their values 
are free to range within the validation domain. The understanding of this section is not 
necessary for the reminder of this paper, and as such, it may be considered optional. 
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IV. Global Sensitivities 


The output sensitivity matrix S in Eq. (10) and the parameter covariance matrix E in Eq. 
(9) provide valuable information on the sensitivity of the predicted output y(t) to variation 
in the model parameters. If a particular parameter sensitivity is poor, then that parameter 
is dropped or fixed as part of the standard procedure in the system identification problem. 
This well-established approach is based on the local sensitivity of the identified model output 
y(t) to the system identification input at the single parameter point 9. Performance metrics 
and requirements that are not based on y(t) will likely exhibit different sensitivities, e.g., the 
sensitivity of the first natural frequency of the identified model to 9. Moreover, derivatives 
about other 9 points cannot be expected to be in agreement with those at 9. In general, 
there is no basis to expect that the results of a global sensitivity analysis will be in agreement 
with those of a local sensitivity analysis, unless the range of variation in 9 is small, or the 
figures of merit prescribing all the components of g are a weakly nonlinear function of 9. 

Up to this point, the problem of assigning values to 9 such that the system model satisfies 
all requirements has been considered. The acceptability of candidate values of 9 is based on 
an error metric which measures, for instance, how closely the model output matches the test 
results when the candidate value is used for the model parameters. In this section, methods 

are explored for determining if, from the point of view of matching test results, the model is 

so insensitive to the value assigned to specific components of 9 that, for analysis purposes, 
those components may be assigned fixed values without materially changing the quality of 
the prediction. If such insensitive parameters can be identified, the dimensionality of the 
parameter identification problem is effectively reduced. Similar ideas are used in Ref. [6]. A 
mathematical formalism to describe these notions is presented next. 

The effect that the value of a subset of the model’s parameters has on the quality of the 
model’s predictions is studied next. To this end, we partition the parameter vector 9 into two 
subvectors, the vector / of parameters whose range of variation within V will be preserved, 
and the vector r of parameters whose value within V will be kept fixed. This partition will 
be represented as 9 = [ f,r ]. Denote by T and 7Z the projections of the validation domain 
onto the /- and r-subspace respectively 

T = {/ : 3r for which [/, r] G V} (28) 

7 Z = {r : 3f for which [/, r] G V} (29) 

The predictions made by the dynamic model are insensitive to r when g([f,r],e) ~ 
g([f,r],e) for all [/, r] G V, where r is any fixed value of r in TZ. A metric for evaluating 
the approximation error incurred by using g([f,f\,e) instead of g([f,r],e) is the dimension 
reduction error J(r), defined as 

J(r) = j f [^([/u],e) - g([f,r],e)] T Q\g([f,r],e) - g([f,r\,e)\dfdr (30) 

where Q is a positive definite matrix. The matrix Q can be used to scale the components of g 
according to the units of the error e prescribing it, and to emphasize particular requirements. 
A small value of J(r) for all r G TZ indicates global insensitivity of g, thus of e and y, to the 
value that r might take in 7 Z. 

A couple of metrics for evaluating key features of the function J(r) as r ranges over 7 Z 
are introduced next. The first of these metrics the maximal dimension reduction error, uy, 
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given by 


r j(f) 

^ =m r|^w 


: r e 71 


(31) 


The metric uy is a tight upper bound on all possible dimension reduction errors that may 
occur by fixing the value of f in TZ. The second metric is the mean dimension reduction 
error , p r , given by 

^ = L ^m df (32) 

While uy depends on the extreme and possibly unlikely case at which J(f) attains is maxima, 
fi r is the average value of J{f) when r is uniformly distributed over 71. The sensitivity metrics 
uy and can be used to rank the model’s parameters and determining which of them can be 
fixed at a constant value. The best of such values is the point where J attains its minimum. 
Note that rankings based on ay and p r may not agree. Further notice that fixing the value 
of the parameters in f effectively reduces the dimension of the unknown parameter space 
from n to dim (/). 

The sensitivity metrics uy and //,. can be evaluated using bounding sets of V. Details on 
how to evaluate them numerically are provided next. Let B (9,rh,pj be a superset of V(e). 
The partition 9 = [/, r] leads to B = B* x B r , where B* — {/ : / — pm* < f < f + pm?}, 
B r = {r : r — pm r < r < f + prh r }, rh = and 9 C = [f c ,r c \. In this setting, the 

sensitivities in Eq. (31) and Eq. (32) are given by 


1 


uy = 


max 


Vol(7£) fee 



B r JBf 


h(f, r, r)/y([/, r])dfdr 


(33) 


1 


/i r 



h(f, r,f)/ v ([/, r])dfdr 


Vol(7£) J fj, J g f 

where h(f,r,r ) is the integrand of Eq. (30), I is the indicator function, and 


(34) 


Vol(R) = / 

JB r 


7 K (r)dr 


(35) 


The indicator function I^(a) is equal to one when a E A, otherwise it is equal to zero. The 
integrals above can be evaluated using sampling. This practice leads to approximation errors 
which diminish only as 0(l/y/d), where d is the number of samples. This results in the usual 
diminished returns of ordinary statistical estimators: to halve the error in the estimate of uy 
we must quadruple d. While the integrals can also be evaluated using Rieman’s sums, the 
computational cost, which grows exponentially with n, restricts this approach to problems 
with small number of uncertainties. As expected, the sets resulting from the optimal-m 
formulations, which bound more tightly the validation domain, yield, for a given number of 
samples, more accurate estimates of the integrals, thus, of the sensitivity metrics. 

The sensitivity of the model prediction to a particular model’s parameter prescribes the 
effort we should devote to modeling the uncertainty on its value. Besides providing insight 
into the characterization of uncertainty, sensitivity information also enables establishing a 
basic notion of causality. By knowing which parameters are important, better system iden- 
tification experiments can be designed and critical requirements can be tightened or relaxed 
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accordingly. Furthermore, if the sensitivity metric for a particular parameter is close to zero, 
the corresponding component of the system matrices in Eqs. (1-2) can be fixed in advance 
and the validation domain of the new model, now in the dim(/)-dimensional subspace, can 
be studied. If the dynamic model is under-parameterized the validation domain correspond- 
ing to acceptable values of the admissible error e will be empty. When the validation domain 
is non-empty, the model can be either overly-parameterized or adequately parameterized. 
The model is overly-parameterized when the value of some parameters can be fixed and 
the model predictions corresponding to the validation domain associated with the remaining 
parameters are about as good as the original. The model is adequately-parameterized when 
fixing the value of any of the parameters yields a substantially degraded model prediction. 

V. Practical Implementation 

Guidelines for practical use of the framework proposed are given next. The following 
step-by-step procedure provides a high-level description of the tasks required to estimate the 
model’s parameters, characterize the uncertainty in their value, and validate the empirical 
model. 

1. Chose the structure of the dynamic model according to the physics governing the 
system, or identify an adequate model structure from the data, using model structure 
determination techniques. 1 The dependence of the model output on the states, controls, 
and parameter vector 9 can be arbitrary. 

2. Qualitatively prescribe the set of requirements to be imposed upon the model. Stability 
and performance requirements in the time and frequency domains, as well as constraints 
on the allowable range of variation for the parameter vector 9, can be included. These 
requirements can bound the offset between the observations and model predictions, or 
can force the model to satisfy desired mathematical properties. 

3. Prescribe the above requirements quantitatively. This tasks entails creating a g for each 
of the requirements such that g < 0 implies requirement satisfaction whereas g > 0 
implies its violation. Requirements bounding the offset between the observations and 
the model predictions can be cast as g = e — e, where the error e is any norm of the 
offset and e is the admissible error. Time and frequency domain specifications can be 
prescribed by setting allowable ranges of variation for key figures of merit, e.g., if t s (9) 
is the settling time to a step input and 1 is the maximum allowable settling time, use 
g = t s {9 ) - 1. 

4. Scale the components of g to make their range of variation comparable. Keep in mind 
the requirement functions have different units and describe different performance met- 
rics. Whereas scaling affects the parameter estimate 9, it does not affect the validation 
domain bounds since their calculation only depends on the sign of g. 

5. Find the maximal-margin parameter estimate 9 in Eq. (19). The best fixed-parameter 
empirical model is given by the structure set in Step (1) evaluated at 9. 

6. Find a parameter realization for which the empirical model satisfies all the require- 
ments. Denote this realization as 9*. 9* can be searched for by using sampling, or can 
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be made equal to the maximal-margin parameter estimate provided that w(6,e) < 0. 
Failing to find a value of 9 for which this inequality holds may indicate that the model 
structure does not capture the problem physics well, or that the requirements are overly 
stringent. If this is the case, revisit steps 1 through 3 to mend these deficiencies. The 
existence of 9* indicates that the validation domain is not empty. 

7. Calculate inner bounding sets of the validation domain using either Eq. (21) or Eq. 
(23) for 9 C = 9*. The latter formulation yields tighter sets. 

8. Calculate outer bounding sets of the validation domain using either Eq. (25) or Eq. 
(27) for 9 C = 9*. The latter formulation yields tighter sets. 

9. The family of validated dynamic models is given by the model structure prescribed in 
Step 1, and the bounds on the validation domain in Steps (7) and (8). While all the 
members inside the inner bounding set yield validated models, all the members outside 
the outer bounding set yield invalidated models. 

10. For a global sensitivity analysis use the developments of Section IV. This analysis 
enables (i) ranking the parameters in 9 according to the extent which they affect the 
model’s ability to meet the requirements, and (ii) determining which parameters can 
be assumed to take on a fixed value without significantly degrading the accuracy of 
the model prediction. 

11. Use the outer bounding set of the validation domain as the support set of uncertainty 
models for 9. This model is commonly used to study the variability on the model 
predictions caused by parameter uncertainty using Monte Carlo analysis. The larger 
the offset between the validation domain and its outer bounding set, the larger the 
conservatism in the analysis. 

A few remarks on the computational requirements of the above procedure are in order. 
Note that the optimization problems in Eqs. (7, 19, 21, 23, 25, 27), are nonlinear in general. 
These problems can be solved by most of the nonlinear programming algorithms available 
in numerical computation software. For the unconstrained problems in Eqs. (7, 19, 23, 
27), techniques such as Nelder-Mead (nonlinear simplex), trust-region, and quasi-Newton 
algorithms are applicable. For the constrained optimization problems in Eqs. (21, 25), 
sequential quadratic programming, interior-point, active-set, and trust-region reflective, can 
be used. In this work, fminsearch, fminunc and fmincon from MATLAB® were used. 

In any non-convex optimization problem there is always the possibility of convergence to 
a local optimum instead of the global optimum. While convergence to a non-global optimum 
in the unconstrained problems of Eqs. (7, 19, 23, 27), is not critical, this is not the case for 
the constrained cases in Eqs. (21, 25). When this anomaly occurs in the non-critical cases, 
the resulting sets are suboptimal bounding sets, e.g., there exists a larger inner bounding set. 
When this anomaly occurs in the critical cases, the resulting set is not an actual bounding 
set. Absolute guarantees of convergence to the global optimum are not possible, but a variety 
of algorithmic safeguards can be used to compensate for this deficiency. For instance, in the 
calculation of an outer bounding set, g can be evaluated at few sample points outside the 
resulting set. If g < 0 for any of these sample points, which will indicate that the calculated 
set is flawed, any of those sample points can be used as initial condition in subsequent 
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searches. Note that Monte Carlo analysis cannot be used to calculate bounding sets, or to 
prove that any set in the parameter space is indeed a bounding set. This is the result of not 
being able to assess the regions of the parameter space where there are no sample points. 
For instance, if the values w at selected sample points within the set £>mner are all negative, 
the possibility of w being positive in between any pair of sample points still exists. While 
finding one sample point in B for which w > 0 is enough to prove that B is not an inner 
bounding set of V, finding that w < 0 at any finite number of sample points is not enough 
to establish that B is contained in V. 

The computational effort of calculating bounding sets depends on the effort required to 
evaluate g, the number of uncertain parameters n, and the convergence properties of the 
nonlinear optimizer. This calculation may entail performing time simulations, calculating 
eigenvalues and/or frequency domain metrics, among many other possibilities. A detailed 
numerical analysis of the convergence to the optimum as a function of these parameters 
is beyond the scope of this paper. However numerical experiments based on the example 
below take a few hundred function evaluations of g, and about 4 minutes (2 x 3.2 GHz 
Quad-Core) to converge. Because this technology is to be applied post-flight, and not in 
real time, the CPU time associated with these calculations does not play a critical role. A 
concrete example, where the main ideas above are developed in detailed, is presented next. 


VI. Example 


An example case using a dynamic model with 6 unknown parameters is studied next. An 
LTI model for the short-period dynamics was extracted from the nonlinear F-16 simulation 
in [1] using central finite differences. A steady level flight condition for a = 5 deg was chosen. 
The model structure is the standard form given in Eqs. (1-2) where u is the deflection of the 
stabilator, x = [a, g] T , a is the angle of attack, q is the pitch rate, and y = x. The system 
matrices are 


m 


9\ $2 
9 a #5 


, B{ff) = 


,C = 


1 0 
0 1 


,D = 


(36) 


The system identification data and the validation data were generated using the nonlinear 
simulation with Gaussian noise for two different doublet input sequences. As a result, the 
system identification input u(ti) and output z(tj), as well as the validation input u v {t t ) and 
output z v (ti), are available. Following the standard approach of Section II, we calculate the 
maximum likelihood estimate 9 = [—0.6454, 0.9066, —0.1538, —3.7948, —1.2015, —6.5242], 
and the standard deviation a = [0.0214,0.0104,0.0208,0.0329,0.0265,0.0682], This infor- 
mation fully prescribes the uncertainty set A (9, 0.05) in Eq. (13). 

Validation requirements based on the error in Eq. (5) will be imposed upon the dynamic 
model. The first requirement is g\ = e s — e, where 


e s (9) 


e 2 {0,u,z) 
e 2 (9,u,z ) 


(37) 


e > 1, and u and z correspond to the system identification data set for R = R. This 
requirement implies that the error incurred by 9 must not exceed the minimum value by 
more than 100(e — 1) %. Note that the function e s (6), which depends upon six unknown 
parameters, has a minimum equal to 1 at 9 = 9. The normalization in (37) enables the 
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comparison of the e 2 error to the value attained by the maximum likelihood estimate. The 
left subplots in Figure 3 show 2-dimensional contours of e s (9) within the 95% confidence 
box A (9, 0.05) . Note that the geometric center of the box is both the maximum likelihood 



Figure 3. Contours of the error function for one (left) and two requirements (right). 

estimate 6 (marked with a + sign) and the maximal-margin parameter estimate 9 (marked 
with a x sign). Elements of 9 that are not varied in the figure are fixed at their maximum 
likelihood value. The largest error caused by uncertainty in d 5 and 9e within the rectangle 
pictured in the upper left is 27% larger than the error at 9, while the largest error caused by 
uncertainty in 9\ and 64 is 11% larger. Further notice that the contours depict the boundary 
of the validation domain corresponding to different values of e. Hence, the validation domains 
have an ellipsoid-like shape centered at 9 whose principal axes are fairly independent of e. 
The boundary of V(1.025) is shown as a black line in all subplots. 

Now consider two requirements, the one above and one based on the validation data set. 
The constraint corresponding to a second requirement is c / 2 — e v — e, where 

eM = e f' u °' z »\ (38) 

9 V = mhi 0 {e 2 (9,u v , z v )}, and u v and z v correspond to the validation data set. Note that 9 V 
is the maximum likelihood estimate corresponding to this data set. In general, the value 
of 9 V does not coincide with 9. The intent and structure of both requirements is the same. 
Furthermore, the same admissible error e will be used in both constraints. As with e Sl the 
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minimum of e v is also 1. This requirement implies that the variation in the error based on 
the validation data set incurred by 9 must not exceed the minimum attainable value by more 
than 100(e — 1) %. 

Figure 4 shows the measured and predicted angle of attack corresponding to the system 
identification data set (top) and the validation data set (bottom). The figure shows the 
measured output (blue line), which is contaminated by noise, and the model predictions 
for 3 values of 9. The predictions corresponding to 9 and 9 V , for which the prediction for 
each requirement is optimal, are shown in red and cyan, while those corresponding to 9 a , a 
value within A ( 9 , 0.05), is shown in black. The difference between the model predictions 
is barely visible. Figures 5 and 6 are used to better show such difference. Figure 5 shows 




Time [s] 


Figure 4. Input, measured output and model predictions for 3 sets of model parameters. 

the unfiltered and filtered versions of the error between the measured output and the model 
prediction for 9 (red-dashed line) and 9 a (black-solid line). The filtered versions result from 
using a low-pass filter to remove the high frequency noise from the measurements. The 
error functions corresponding to 9 V , which are practically indistinguishable from those of 
9, are omitted. Figure 6 shows the cumulative values of e s and e v for 9, 9 V and 9 a . By 
cumulative value we mean that the final integration time of the numerator in Eq. (37) and 
Eq. (38) is varied from 0 sec to 10 sec. Note that the values of e s and e v are equal to the 
terminal value of the cumulative errors. Further notice that dynamic models that describe 
well one of the measured outputs may be inaccurate predictors of the other one. For example, 
the dynamic model evaluated at 9 a exceeds the minimum matching errors of the system 
identification and validation data sets by 70%, and 285%, respectively. If we set 100% as 
the maximum allowable percentage of exceedance for both requirements, the dynamic model 
associated with 9 a will be outside of the validation domain even though its prediction of the 
system identification output is sufficiently accurate. Note also that even though 9 and 9 V 
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yield practically identical predictions, e s (0) < e s (6 v ) and e v (#) > e v (6 v ). The inequality 




Figure 5. Error difference between measurement and prediction for 2 sets of model parameters. 

constraint describing the satisfaction of both requirements for the same admissible error is 
g — [9ii92\ = [e s (0) — e,e v (6) — e] < 0. The parameter realizations satisfying both of these 
inequalities comprise the validation domain for both requirements. The boundary of V(e) 
can be displayed by plotting contours of the function e av (9) = ma x{e s (0) , e v (0)} , which is 
the worst-case requirement function in Eq. (18). 

The right subplots in Figure 3 show 2-dimensional contours of e sv (6) within the 95% 
confidence box A (0, 0.05). In this setting the maximal-margin parameter estimate, which is 
the parameter realization minimizing e sv marked with a x sign, differs from the maximum 
likelihood estimate 6, which is the parameter realization minimizing e s marked with a + 
sign. Notice that there are values of e for which 6 is outside the validation domain. This 
is the case when e = 1.025, which is the value leading to the validation domain boundary 
shown in black. This implies that the parameter estimate leading to the best value of e s (i.e., 
best estimate regarding the first requirement) yields a value of e v that exceeds admissible 
levels. The maximal-margin parameter estimate, which is not even at the geometric center 
of V(1.025), is 9 C = [-0.6280, 0.9060, -0.1587, -3.7862, -1.2212, -6.5461]. As expected, the 
minimum of e sv (9 5 , %) and of e sv ( 6 1, 0 2 ) are larger than the minimum of both e s (d 5 , d 6 ) and 
e v (6i, 6*2). Values of e below such a minimum yield an empty validation domain, i.e., there 
are no parameter realizations satisfying both requirements. The comparison between the 
left and right subplots of Figure 3 shows that the imposition of an additional requirement 
substantially increases the error spread within A. In contrast to the single requirement 
case shown to the left, the validation domains corresponding to small values of e are no 
longer ellipsoids. Furthermore, note that for large values of e, the principal axes of the 
family of ellipsoid- like curves shown to the right slightly differ from those shown to the left. 
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Figure 6. Cumulative errors associated with model predictions for 3 sets of model parameters. 


These observations illustrate that the imposition of additional requirements, even in this 
simple case for which the structure of both requirements uses the same error used by the 
maximum-likelihood approach, can substantially affect both the maximal-margin parameter 
estimate and the geometry of the validation domain. Figure 3 illustrates how the imposition 
of an additional requirement reduces the size of the validation domain while ensuring that V 
corresponding to a given set of m requirements is contained by the V corresponding to any 
subset of m — 1 requirements out of these m. 

Next, we calculate the PSM function Pi nne r(e) for the center 6 C = 0, and the fixed aspect 
vector m = er/||cr||. Recall that the validation domain being bounded is a set in the 6- 
dimensional 9 space. As before, we will consider a case with a single requirement based on 
Eq.(37), and another case with both requirements based on Eqs. (37) and (38). Figure 7 
shows the ratio Pmner (e)/ ||cr|| for the one requirement case (red dashed line) and the two 
requirement case (blue solid line). For a given inner bounding box one must accept greater 
error in order to satisfy two requirements vs. only one requirement. For example, the largest 
error within the box A (9, a ) , which corresponds to the ordinate value 2 in Figure 7, is about 
75% larger than the smallest attainable error. In contrast, the two requirement case leads 
to maximum errors that are about 265% larger than the minimum error. As it was observed 
in the 2-dimensional plots above, the imposition of the second requirement considerably 
increases the prediction error within the box. Note that the value of e where p j nn er (e ) = 0 is 
1.07 in the two requirement case and 1.0 in the one requirement case. This is consistent with 
the increase of the minimum of e sv mentioned above. In this case the second requirement is 
the worst-case requirement for all values of e (the PSM function corresponding to the second 
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Figure 7. /•’inner!' )/! ° for the one and two requirement cases having m = a. 


requirement alone will lead to the same blue line). In general, the individual requirement 
that prescribes the worst-case requirement depends on the value of e. 

Next, we calculate inner bounds of V(e) for the two requirement case using the optimal- 
m formulation. Figure 8 shows the corresponding PSM, p iririer . Recall that the aspect 
vector fhinner maximizes the volume of the inner bounding set. In this case the opti- 
mal minner changes slightly with e. While the aspect vector used in Figure 7 is m = 
(t/||<t|| = [0.24,0.12,0.24,0.38,0.30,0.79], the optimal ones are closely scattered about m = 
[0.36,0.12,0.30,0.55,0.18,0.64]. 

Next, we calculate outer bounds of V(e) for the two requirement case using the optimal-m 
formulation. Figure 8 shows the corresponding PSM, p outer . Recall that the aspect vector 
m 0 uter minimizes the volume of the outer bounding set. In contrast to the previous case, the 
optimal ihouter changes considerably with e. Figure 8 indicates that the volume of the outer 
bounding set is several orders of magnitude larger than the volume of the inner bounding 
set. This difference indicates that the maximum likelihood estimate 0 is either far from the 
centroid of V(e) (as the left plots of Figure 3 suggest) or that G'[V] has long spikes intruding 
into the validation domain. 

Finally, the sensitivity analysis of Section IV is used to rank the six parameters in 9. To 
this end we will use the bounding set of V(1.15). Therefore, prediction errors that exceed 
in up to 15% the minimum attainable e 2 error for both the system identification data set 
and validation data set are admissible. The optimal PSM, optimal aspect vector, and vol- 
ume of the inner bounding set are Pmner = 0.0332, m iimer = [0.3914,0.1048,0.2902,0.4574, 
0.2120, 0.7052] and Vol(£ iimer ) = 7.0154 x 10 11 , while the values corresponding to the outer 
bounding set are p outer = 0.3261, m 0 uter = [0.2204,0.3945,0.2554,0.7975, 0.2198,0.2147], 
and Vol(£>outer) = 0.06445. The dimension reduction error, J(r)/Vol(77) for f = 6% . . .9 e , 
corresponding to B 0 uter are shown in Figure 9. Vertical lines at the values of the maximum 
likelihood estimate are superimposed. The value of J(r)/ Vol(77.) at the maximum likeli- 
hood estimate as well as the derivative at such a point, cannot be used to describe global 
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Figure 8. PSMs and volumes of the optimal-ra bounding sets as a function of the admissible 
error e. 


sensitivities. The sensitivities u> r and //,. can be readily evaluated from these functions. Sen- 
sitivities based on the maximal dimension reduction error are = 0.0161, ujq 2 = 0.1729, 
uj $ 3 = 0.0157, u ]g 4 = 0.0229, u>g 5 = 0.0432, and u>g 6 = 0.0013. This yields the parameter rank- 
ing in decreasing order of importance [2, 5, 4, 1, 3, 6]. Conversely, sensitivities based on the 
mean dimension reduction error are ng ± = 0.0033, /j,q 2 = 0.0475, /j,g 3 = 0.0038, no A = 0.0054, 
/i0 5 = 0.0086, and y,g e = 0.0003. The parameter ranking according to /i is [2, 5, 4, 3, 1, 6]. 
In both cases there is a two orders of magnitude difference between the most and the least 
important parameters. The functions shown in Figure 9 illustrate how J, thus the sensi- 
tivities, varies within 77. For instance, J(r)/Vol(77) for f = % at the maximum likelihood 
estimate 9 5 = —1.2015 is 0.0011. This is about 400% and 80% smaller than the sensitivity 
for the maximal-margin and mean value respectively over 77. Hence, variation within 77 
caused by variations in e will yield different sensitivity values and rankings. In the context 
of the maximum likelihood approach, a sensitivity metric to 0i is 1 / a, . This metric leads 
to values [46.78,96.06,48.13,30.40,37.66,14.63] and an associated ranking of [2, 3, 1, 5, 4, 6] 
which differs from the two rankings above. The ratio between the most and the least im- 
portant parameters is 6.56, 158.3, and 133 according to l/dj, w r , and fi r respectively. These 
major differences are to be expected since the conventional approach is based on local sen- 
sitivities of the output y, whereas the global sensitivities proposed account for the effects of 
the model’s parameters on several figures of merit over the entire validation domain. 
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Figure 9. J (r) /\o\(JZ) for r = 0^ for z = 1, ... 6. 


VII. Discussion 

In this section a qualitative comparison is made between conventional uncertainty quan- 
tification and model validation techniques and those proposed herein. 

For the conventional approach of Section II, the dynamic model evaluated at the maximum- 
likelihood parameter estimate yields a prediction that optimally matches the measured data 
from a single flight test maneuver. As such, the identified parameter value is based on a 
single flight experiment and the particular performance metric in Eq. (5). There is no theory 
or rationale for assessing the error incurred by evaluating the dynamic model at any other 
realization of the model’s parameters. The accuracy of the identified model parameters using 
this approach depends on the suitability of the model for representing the system dynamics, 
the characteristics of the measurement noise //. and the effectiveness of the input in exciting 
the system response in the desired frequency range. The uncertainty set that results from ap- 
plying this approach, which uses local sensitivities evaluated at the parameter estimate, has 
a statistical foundation solely based on the underlying assumptions prescribing the measure- 
ment noise. Data from multiple flight test maneuvers can be used to refine both the dynamic 
model and the uncertainty set, but this must be achieved by concatenating the data from 
the multiple maneuvers, or by using sequential calculations in a Bayesian formulation. 1 

Conversely, the framework proposed here bounds the set of model parameters for which 
the empirical model is requirement compliant, e.g., the model prediction accurately describes 
the measured output corresponding to several maneuvers within an admissible level of error. 
The relationship between this set of parameter values (i.e., the validation domain) and the 
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prediction error is verifiable and objective. The validation domain can be systematically 
reduced by imposing additional requirements or by using data from additional flight test 
maneuvers. Both the structure of the dynamic model and the requirements can assume 
arbitrary forms, e.g., a nonlinear system model can be subject to a matching error based 
on an Zoo-norm, or a linear system model can be subject to requirements in the frequency 
domain. The calculation of the validation domain is not based on local sensitivities and 
assumed probability density functions, but rather on the ability of the model to satisfy the 
requirements imposed upon it. 

According to conventional methods, the value of a,j is related to the inverse of the sen- 
sitivity of the model prediction y{9 ) to variations in 9 :] , i.e., high sensitivity to a particular 
parameter is reflected in a correspondingly tight uncertainty range, cf. Eq. (10). This sensi- 
tivity metric solely depends on a derivative evaluated at a single parameter realization, and 
thereby it is local in nature. In contrast, the sensitivities in Section IV evaluate the range of 
variability of the requirements functions as 9 varies over the entire validation domain. The 
results of local and global sensitivity analyses will coincide when the dependence of g on 9 
is almost linear over the validation domain. Local and global sensitivities will likely differ in 
most other cases. 

The variation in the model predictions due to uncertainty in the model parameters is 
commonly studied using Monte Carlo analysis. This analysis often uses a prescribed proba- 
bilistic uncertainty model for 9 , typically in the form of a joint probability density function 
defined over some support set 5cR n . Ideally, the support set <S should not extend beyond 
the validation domain. This will guarantee that only the validated model parameters will be 
taken into account during the analysis. Unfortunately, the complex geometry of V usually 
precludes this practice. The best alternative is for S to contain the validation domain as 
tightly as possible. This can be attained by making S equal to the outer bounding set of 

V resulting from the free-m formulation proposed herein. While an analysis based on this 
uncertainty model considers all the validated dynamic models as desired, parameter values 
in S that are not in V, for which the model prediction is invalid, will also be included. When 

V is not fully contained by S, elements of the unknown parameter space that adequately 
predict the observations will be omitted from the study. 

In the conventional approach the uncertainty set in Eqs. (12, 13) depends on the value 
r], which, in turn, depends on the analyst’s choice for the confidence level. As such, the 
results and conjectures derived from a subsequent Monte Carlo analyses for which S = A 
will be subjective. By contrast, analyses for which S = £> outcr will be exempt from this type 
subjectivity. To totally eliminate the subjectivity in the prescription of the support set, 
the center of £> ou ter, 9 C , must be an additional design variable in Eq. (27). Note that the 
framework proposed does not relate the validation domain to the “true” value of 9, which 
could well be an unknown constant or an aleatory variable, nor does it yield information 
about likelihoods within V. The probabilistic distribution within the support set S, which 
is usually set according to past experience and expert judgement, will remain subjective. 

Knowledge of the validation domain for empirical parametric models has many practical 
uses. For example, Monte Carlo analysis for evaluating mission success in the presence of 
modeling uncertainty requires this information. The accuracy of model uncertainty charac- 
terization affects not only the predicted probability of mission success, but also influences 
mission cost and risk. Because the unknown parameters in empirical dynamic models for 
aircraft typically represent aerodynamic characteristics (such as stability and control deriva- 
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tives), knowledge of the uncertainty in the parameter estimates is essential for evaluating 
correlation with results from ground-based aerodynamic prediction methods such as wind 
tunnel testing and computational fluid dynamics. The resulting comparisons can inform 
and guide efforts for improvement in the ground-based prediction methods by indicating 
which results are in agreement, based on model performance, and which exhibit a meaning- 
ful difference. Furthermore, many critical evaluations related to aircraft stability, control, 
and fault detection depend on deciding when the values of dynamic model parameters cross 
some threshold values. Distinguishing real parameter variations from random statistical 
variations associated with the parameter estimation process is critical for these tasks, and 
requires good uncertainty measures. 

VIII. Concluding Remarks 

This paper develops a framework for the parameter estimation and validation of empirical 
dynamic models subject to an arbitrary set of validation criteria. The validation requirements 
imposed upon the model, which might include several sets of input-output data and arbitrary 
specifications in time and frequency domains, are used to determine if model predictions are 
within admissible error limits. This analysis is used to calculate parameter estimates, to 
characterize the uncertainty in their values, and to validate the resulting empirical model 
against several figures of merit. In contrast to standard approaches in which the only source of 
discrepancy between the observations and the prediction is assumed to be measurement noise, 
this approach casts all discrepancies (e.g., those caused by measurement noise, model-form 
uncertainty, unsteady aerodynamics, etc.) as parametric uncertainty. This work sets forth 
a new paradigm for deriving, characterizing and validating empirical dynamic models. 
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